Statistical analysis for exploring environmental Citizen Science practices at ILTER

This document presents the dataset and the statistical analysis used in the paper Bergami et al. (2022) DOI to explore environmental Citizen Science practices at ILTER, starting from the results of a global survey.

June 18, 2022

The dataset

Show code
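## attach the packages whose functions are called without a namespace below (%>%, n(), aes(), all_continuous())
library(dplyr)
library(ggplot2)
library(gtsummary)
## import dataset
dataset <- readxl::read_excel("ILTER and Public Engagement_October2020_secondPart.xlsx")
dataset$age <- as.numeric(format(Sys.Date(), "%Y")) - dataset$Q33
rmarkdown::paged_table(dataset, options = list(rows.print = 15))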

To download the file, please visit the GitHub repo. If you reuse the data in other publications or analyses, please cite this article or the dataset (see the Citation section).

General Information

The link to the survey was sent to all ILTER site managers through the ILTER secretariat contact list (850 email recipients). The questionnaire remained open from the end of February to mid-September 2020, with two reminders sent within this period. In total, we received 163 responses that were at least 50% complete. Based on an estimated 850-1000 participating scientists, our response rate is 16-19%.
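The 16-19% range follows directly from the figures in the paragraph above. A minimal sketch of the arithmetic, using only those reported numbers (the variable names are illustrative):

```r
# Responses that were at least 50 % complete (reported above)
responses <- 163

# Estimated number of participating scientists: lower and upper bound (reported above)
scientists <- c(lower = 850, upper = 1000)

# Response rate in percent for each bound
rate <- round(responses / scientists * 100, 1)
print(rate)
```

The smaller pool gives the higher rate (19.2 %) and the larger pool the lower one (16.3 %), which the text rounds to 16-19%.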

The number of people who accessed the survey (whether or not they finished it):

Show code
totalAnswers <- dataset %>% 
  dplyr::filter(Finished == 'True' | Finished == 'False') %>% dplyr::count()

296

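The response rate computed on everyone who accessed the survey, rather than on the 163 responses that were at least 50 % complete, is:

Show code
# the denominator 750 differs from the 850 email recipients reported above
respRate <- round((totalAnswers/750)*100, 2)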

39.47 %

Pool for the second part of the survey: the number of answers with a completeness >= 75 % across columns Q10 to Q30:

Show code
CSPool <- dataset %>% 
  dplyr::filter(as.numeric(ProgressCS) >= 75) %>% 
  # dplyr::filter(Finished == 'True') %>% # = 75
  dplyr::count()

77
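The source does not show how the ProgressCS column was derived. As a hedged illustration only, a row-wise completeness percentage over a block of question columns could be computed as below. The toy data frame and its values are hypothetical and stand in for columns Q10 to Q30.

```r
# Hypothetical answers for four respondents over four question columns
toy <- data.frame(
  Q10 = c(2, NA, 1, 0),
  Q11 = c("a", NA, "b", NA),
  Q12 = c("Yes", NA, NA, NA),
  Q13 = c("Local", "Local", "Regional", NA)
)

# Share of non-missing answers per respondent, in percent
progress_cs <- rowMeans(!is.na(toy)) * 100

# Size of the pool with a completeness >= 75 %
pool_size <- sum(progress_cs >= 75)
print(progress_cs)
print(pool_size)
```

An answer of 0 counts as given, so only true missing values lower the percentage in this sketch.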

The number of people who finished the survey (regardless of completeness):

Show code
completeAnswers <- dataset %>% 
  dplyr::filter(Finished == 'True') %>% dplyr::count()

142

The number of answers with a completeness >= 50 %:

Show code
halfOfAnswers <- dataset %>% dplyr::filter(as.numeric(Progress) >= 50) %>% dplyr::count()

163

The number of answers where the reference to the ILTER site, via DEIMS.iD, was NOT indicated:

Show code
noDEIMSAnswers <- dataset %>% 
  dplyr::filter(Q30 == 'NA') %>% dplyr::count()

201

The number of answers where the reference to the ILTER site, via DEIMS.iD, was indicated:

Show code
DEIMSAnswers <- dataset %>% 
  dplyr::filter(Q30 != 'NA') %>% dplyr::count()

95
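The two filters above compare Q30 with the text 'NA', not with R's missing value. The counts are consistent with that reading (201 + 95 = 296), but the distinction matters because dplyr::filter() drops rows where the condition evaluates to NA. A small base R sketch on hypothetical values (the URL is a made-up placeholder) shows the difference:

```r
# Hypothetical Q30 values: a site URL, the text "NA", and a true missing value
q30 <- c("https://deims.org/example-site", "NA", NA)

# Comparing with the string yields NA for the truly missing entry
is_text_na <- q30 == "NA"

# is.na() flags only the truly missing entry
is_missing <- is.na(q30)

# Counting both kinds of "no DEIMS.iD" in one step
no_deims <- sum(is.na(q30) | q30 %in% "NA")
print(is_text_na)
print(is_missing)
print(no_deims)
```

If a column can hold both forms, combining the two tests as in the last step avoids silently losing rows.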

The number of answers in the CS pool (completeness >= 75 % in the second part) that indicate a DEIMS.iD:

Show code
CSPoolWithDEIMSID <- dataset %>% 
  dplyr::filter(as.numeric(ProgressCS) >= 75) %>% 
  dplyr::filter(Q30 != 'NA') %>% 
  dplyr::count()

52

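The number of answers in the CS pool with LTER network information:

Show code
CSPoolWithLTERNetwork <- dataset %>% 
  dplyr::filter(as.numeric(ProgressCS) >= 75) %>% 
  # test the column itself (assumed to be named LTERNetwork); the quoted name used to
  # produce the count below is never NA, so it kept every row of the pool
  dplyr::filter(!is.na(LTERNetwork)) %>% 
  dplyr::count()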

77

The number of respondents who took part in at least one CS initiative:

Show code
participationVSresponcesAll <- dataset %>% 
  dplyr::filter(Q10 > 0) %>% 
  dplyr::count()

96

The number of respondents with a completeness >= 50 % who took part in at least one CS initiative:

Show code
participationVSresponces <- dataset %>% 
  dplyr::filter(as.numeric(Progress) >= 50) %>%
  dplyr::filter(Q10 > 0) %>% 
  dplyr::count()

90

The number of CS initiatives declared among respondents with a completeness >= 50 %:

Show code
csIntiatives <- dataset %>%
  dplyr::filter(as.numeric(Progress) >= 50) %>%
  dplyr::filter(Q10 > 0) %>%
  dplyr::select(Q10) %>%
  dplyr::summarise(sum(Q10))

392

Geographic distribution of responses

Show code
## Join with ILTER DEIMS GeoInfo
# Connect and download layers from LTER-Europe's GeoServer
fileName <- tempfile()
download.file("https://data.lter-europe.net/geoserver/deims/wfs?SERVICE=WFS&VERSION=1.0.0&REQUEST=GetFeature&TYPENAME=deims:ilter_all_formal&SRSNAME=EPSG:4326", fileName)
request <- rwfs::GMLFile$new(fileName)
client <- rwfs::WFSCachingClient$new(request)
ilter_all_formal <- client$getLayer("ilter_all_formal")
## Reading layer `ilter_all_formal' from data source 
##   `/private/var/folders/p1/110rx8q101z0wn0bwh4njrcw0000gn/T/RtmpsiBiAb/filec2edd801479' 
##   using driver `GML'
## Simple feature collection with 754 features and 6 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: -156.5648 ymin: -78 xmax: 175.085 ymax: 79
## CRS:           NA
Show code
sitesOnSurvey <- ilter_all_formal[ilter_all_formal$deimsid %in% dataset$Q30, ]

htmltools::div(
    style = htmltools::css(width="100%", height='100%'),
    leaflet::leaflet(sitesOnSurvey) %>%
    leaflet::addTiles() %>%
    # addMouseCoordinates() %>%
    leaflet::setView(lng = 113.63962, lat = 23.16001, zoom = 1) %>%
    leaflet::addMarkers(
      clusterOptions = leaflet::markerClusterOptions(),
      popup = paste0(
              # "Name: <b>", sitesOnSurvey$name, "</b><br/>",
              "DEIMS.iD: <b><a target = '_blank' href = '", sitesOnSurvey$deimsid, "'>", sitesOnSurvey$deimsid, "</a></b><br/>"
            ),
       group = "Sites"
    ) %>%
    leaflet::addWMSTiles(
      'http://getit.lteritalia.it/geoserver/wms',
      layers = 'geonode:Zonobiome_poly',
      options = leaflet::WMSTileOptions(
        # styles = ,
        format = "image/png",
        transparent = TRUE),
      group = "Biome"
    ) %>%
    leaflet.extras::addWMSLegend(
      position = 'topright',
      uri = 'http://getit.lteritalia.it/geoserver/wms?REQUEST=GetLegendGraphic&VERSION=1.0.0&FORMAT=image/png&WIDTH=20&HEIGHT=20&LAYER=geonode:Zonobiome_poly'
    ) %>%
    leaflet::addLayersControl(position = 'bottomright',
                              overlayGroups = c("Sites", "Biome"),
                              options = leaflet::layersControlOptions(collapsed = FALSE)
    )
)

Research questions

5a - How many citizen science initiatives at your LTER Site or LTSER Platform have you been involved in? This number should include current and past initiatives.

The percentage of respondents with a completeness >= 50 % who took part in at least one CS initiative is:

Show code
partiRate <- round((participationVSresponces/halfOfAnswers)*100, 1)

55.2 %
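The counts reported in the General Information section can be cross-checked by hand. The sketch below only re-uses the numbers printed above; the list and its element names are illustrative.

```r
# Counts as reported in the outputs above
counts <- list(
  accessed    = 296,  # people who accessed the survey
  no_deims    = 201,  # answers without a DEIMS.iD
  with_deims  = 95,   # answers with a DEIMS.iD
  half        = 163,  # answers with a completeness >= 50 %
  cs_half     = 90,   # of those, respondents with at least one CS initiative
  cs_pool     = 77,   # answers with a completeness >= 75 % in the second part
  cs_pool_ids = 52    # of those, answers with a DEIMS.iD
)

# The two DEIMS.iD groups should partition all accesses
partition_ok <- counts$no_deims + counts$with_deims == counts$accessed

# A subset can never be larger than the pool it is drawn from
subset_ok <- counts$cs_pool_ids <= counts$cs_pool

# Participation rate among answers with a completeness >= 50 %
parti_rate <- round(counts$cs_half / counts$half * 100, 1)
print(c(partition_ok, subset_ok))
print(parti_rate)
```

Checks of this kind are cheap to keep next to the analysis and catch filters that silently drop or duplicate rows.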

The average number of CS projects per respondent with a completeness >= 75 % in the second part of the survey is:

Show code
average <- dataset %>% 
  dplyr::filter(as.numeric(ProgressCS) >= 75) %>%
  dplyr::select(Q10) %>%
  dplyr::summarise(average = round(mean(Q10, na.rm=TRUE), 1))

4.6

5b - Does participation in CS differ by gender of ILTER scientists?

Show code
participationCSDifference <- dataset %>%
  dplyr::filter(as.numeric(ProgressCS) >= 75) %>% 
  dplyr::filter(Q10 > 0) %>%
  dplyr::select(c(Q10, Q31:Q36)) #%>% View()
participationCSDifference$age <- as.numeric(format(Sys.Date(), "%Y")) - participationCSDifference$Q33
# Q35 Gender
participationCSDifference %>%
  dplyr::group_by(Q35) %>% 
  # dplyr::summarise(totalCSInitiative = sum(Q10)) %>% 
  dplyr::count(Q35) %>% 
  dplyr::filter(n > 1) %>%
  ggplot2::ggplot(ggplot2::aes(x = Q35, y = n)) +
  ggplot2::geom_bar(stat = "identity", fill = "orange") +
  ggplot2::xlab("Gender") + ggplot2::ylab("Participants in CS initiatives") +
  ggplot2::geom_text(ggplot2::aes(label = n), vjust = 1.6, color = "white", size = 3.5)+
  ggplot2::theme_classic()
Show code
partiGender <- participationCSDifference %>%
  dplyr::select(Q10, Q35)
table1 <- gtsummary::tbl_summary(
  data = partiGender,
  label = list(
    Q10 = "CS projects declared by participant (Q10)",
    Q35 = "Gender participant(Q35)"
  )
)
table1
Characteristic                               N = 76 [1]
CS projects declared by participant (Q10)    3.0 (2.0, 4.0)
Gender participant(Q35)
  Female                                     29 (39%)
  Male                                       45 (61%)
  Unknown                                    2
[1] Median (IQR); n (%)
Show code
partiGender %>%
  gtsummary::tbl_summary(
    by = Q35, # split table by group
    missing = "no", # don't list missing data separately
    statistic = list(all_continuous() ~ "{mean} ({sd})"),
    label = list(Q10 = "CS projects declared (Q10)")
  ) %>%
  gtsummary::add_n() %>% # add column with total number of non-missing observations
  gtsummary::add_p() %>% # test for a difference between groups
  gtsummary::modify_header(label = "**Variable**") %>% # update the column header
  gtsummary::bold_labels()
Variable                      N     Female, N = 29 [1]    Male, N = 45 [1]    p-value [2]
CS projects declared (Q10)    74    5.5 (10.1)            4.1 (4.7)           0.8
[1] Mean (SD)
[2] Wilcoxon rank sum test
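The table footnote names the Wilcoxon rank sum test, which base R provides as wilcox.test(). The sketch below runs it on made-up values, not on the survey data, to show the shape of the comparison between the two gender groups.

```r
# Hypothetical number of CS projects (Q10) for ten respondents, by gender (Q35)
toy <- data.frame(
  Q10 = c(1, 3, 2, 12, 5, 4, 2, 6, 3, 1),
  Q35 = rep(c("Female", "Male"), each = 5)
)

# Group means, comparable to the "Mean (SD)" columns of the table
group_means <- tapply(toy$Q10, toy$Q35, mean)

# Two-sample Wilcoxon rank sum test; exact = FALSE because the toy data contain ties
res <- wilcox.test(Q10 ~ Q35, data = toy, exact = FALSE)
print(group_means)
print(res$p.value)
```

A rank-based test suits count data like Q10, where a few respondents report many more projects than the rest and the standard deviation exceeds the mean.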

7a - Spatial scale of the CS initiatives

Show code
dataset %>% 
  dplyr::filter(as.numeric(ProgressCS) >= 75) %>% 
  dplyr::filter(Q10 > 0) %>% 
  dplyr::select(c(Q10, Q13)) %>% # 73
  dplyr::count(Q13) %>% 
  dplyr::filter(n > 1) %>% 
  ggplot2::ggplot(ggplot2::aes(x = Q13, y = n)) +
  ggplot2::geom_bar(stat = "identity", fill = "green4") +
  ggplot2::xlab("") + ggplot2::ylab("Number of projects") +
  ggplot2::scale_x_discrete(labels = function(x) stringr::str_wrap(x, width = 10)) +
  ggplot2::geom_text(ggplot2::aes(label = n), vjust = 1.6, color = "white", size = 3.5) +
  ggplot2::theme_classic()

7a - Temporal scale of the CS initiatives

Show code
dataset %>% 
  dplyr::filter(as.numeric(ProgressCS) >= 75) %>% 
  dplyr::filter(Q10 > 0) %>% 
  dplyr::select(c(Q12, Q56)) %>% # 73
  dplyr::group_by(Q12, Q56) %>%
  dplyr::summarise(freq = n()) %>% 
  dplyr::filter(!is.na(Q12)) %>% 
  ggplot2::ggplot(ggplot2::aes(x = Q12, y = Q56)) +
  ggplot2::geom_point(aes(size = freq), colour = "green4") +
  ggplot2::xlab("Is the project still active?") + 
  ggplot2::ylab("Number of projects") +
  ggplot2::labs(size = "n of years") +
  ggplot2::scale_size_continuous(
    breaks = c(2, 4, 6, 8),
    labels = c('<=2', '4', '6', '=>8')
  ) +
  ggplot2::theme_bw()

7b - Research focus

Show code
dataset %>% 
  dplyr::filter(as.numeric(ProgressCS) >= 75) %>% 
  dplyr::filter(Q10 > 0) %>% 
  dplyr::select(c(Q10, Q14)) %>% # 73
  dplyr::count(Q14) %>% 
  ggplot2::ggplot(ggplot2::aes(x = Q14, y = n)) +
  ggplot2::geom_bar(stat = "identity", fill = "green4") +
  ggplot2::xlab("Research focus") + ggplot2::ylab("Number of projects") +
  ggplot2::geom_text(ggplot2::aes(label = n), vjust = 1.6, color = "white", size = 3.5) +
  ggplot2::scale_x_discrete(labels = function(x) stringr::str_wrap(x, width = 10)) +
  ggplot2::theme_classic()

7c - Participants number/year

Show code
dataset %>% 
  dplyr::filter(as.numeric(ProgressCS) >= 75) %>%  
  dplyr::filter(Q10 > 0) %>% 
  dplyr::select(Q10, Q17, Q18:Q18_8_TEXT, Q19:Q19_4_TEXT) %>% # 73
  dplyr::group_by(Q17) %>%
  dplyr::summarise(numProj = n()) %>%
  dplyr::mutate(Q17 = forcats::fct_relevel(Q17, "Fewer than 25", "25-50", "51-100", "101-500", "More than 500", "NA")) %>% 
  ggplot2::ggplot(ggplot2::aes(x = Q17, y = numProj)) +
  ggplot2::xlab("n participants/year") + ggplot2::ylab("Number of projects") +
  ggplot2::geom_bar(stat = "identity", fill = "blue4") +
  ggplot2::geom_text(ggplot2::aes(label = numProj), vjust = 1.6, color = "white", size = 3.5) +
  ggplot2::theme_classic()

7c - Type of participants

Show code
dataset %>% 
  dplyr::filter(as.numeric(ProgressCS) >= 75) %>%  
  dplyr::filter(Q10 > 0) %>% 
  dplyr::select(Q10, Q17, Q18:Q18_8_TEXT, Q19:Q19_4_TEXT) %>% # 73
  dplyr::group_by(Q18) %>%
  dplyr::summarise(numProj = n()) %>% 
  dplyr::filter(numProj > 3) %>%
  ggplot2::ggplot(ggplot2::aes(x = Q18, y = numProj)) +
  ggplot2::geom_bar(stat = "identity", fill = "blue4") +
  ggplot2::xlab("Type of participants") + ggplot2::ylab("Number of projects") +
  ggplot2::geom_text(ggplot2::aes(label = numProj), vjust = 1.6, color = "white", size = 3.5) +
  ggplot2::scale_x_discrete(labels = function(x) stringr::str_wrap(x, width = 10)) +
  ggplot2::theme_classic()

7c - Underserved communities

Show code
dataset %>% 
  dplyr::filter(as.numeric(ProgressCS) >= 75) %>% 
  dplyr::filter(Q10 > 0) %>% 
  dplyr::select(Q10, Q17, Q18:Q18_8_TEXT, Q19:Q19_4_TEXT) %>% # 73
  dplyr::group_by(Q19) %>%
  dplyr::summarise(numProj = n()) %>% 
  dplyr::mutate(Q19 = forcats::fct_relevel(
    Q19, 
    "Volunteers who have limited financial resources", "Volunteers who live in rural areas", "No undeserved communities", "Other"
  )) %>% 
  dplyr::filter(numProj > 6) %>% 
  ggplot2::ggplot(ggplot2::aes(x = Q19, y = numProj)) +
  ggplot2::geom_bar(stat = "identity", fill = "blue4") +
  ggplot2::xlab("Underserved communities") + ggplot2::ylab("Number of projects") +
  ggplot2::geom_text(ggplot2::aes(label = numProj), vjust = 1.6, color = "white", size = 3.5) +
  ggplot2::scale_x_discrete(labels = function(x) stringr::str_wrap(x, width = 10)) +
  ggplot2::theme_classic()

7d - Participation frequency

Show code
dataset %>% 
  dplyr::filter(as.numeric(ProgressCS) >= 75) %>% 
  dplyr::filter(Q10 > 0) %>% 
  dplyr::select(Q10, Q20) %>% # 73
  dplyr::group_by(Q20) %>%
  dplyr::summarise(numProj = n()) %>% 
  dplyr::filter(!is.na(Q20)) %>% 
  dplyr::mutate(Q20 = forcats::fct_relevel(Q20, 
                                    "Once", "Two to three times", "Four to six times", "More than six times")) %>% 
  ggplot2::ggplot(ggplot2::aes(x = Q20, y = numProj)) +
  ggplot2::geom_bar(stat = "identity", fill = "blue4") +
  ggplot2::xlab("Participation frequency") + ggplot2::ylab("Number of projects") +
  ggplot2::geom_text(ggplot2::aes(label = numProj), vjust = 1.6, color = "white", size = 3.5) +
  ggplot2::theme_classic()

7e - Type of involvement of the participants in the CS initiatives

Show code
matrix7e <- # dataset %>% 
  # dplyr::filter(as.numeric(Progress) >= 50) %>% 
  dataset %>% 
  dplyr::filter(as.numeric(ProgressCS) >= 75) %>% 
  dplyr::filter(Q10 > 0) %>% 
  dplyr::select(Q21_1:Q21_13) %>%
  tidyr::gather(questions, typeOfInvol) %>% dplyr::group_by(questions, typeOfInvol) %>% dplyr::count() %>% dplyr::ungroup() %>% tidyr::spread(questions, n) %>% 
  t() %>% 
  data.frame(row.names(.), ., row.names = NULL) %>% 
  `colnames<-`(c('Activity in CS', 'High involvement', 'Moderate involvement', 'Not at all involved', 'Very high involvement', 'Very little involvement', 'NA')) %>% 
  .[-1,-1] %>% .[,-6] %>% 
  as.matrix() %>% 
  `rownames<-`(c(
    'Help define research questions',
    'Help interpret data and draw conclusions',
    'Help disseminate conclusions',
    'Help translate the results into action',
    'Help discuss results and ask new questions',
    'Help gather information and resources for research',
    'Help develop hypotheses',
    'Help design data collection methodologies',
    'Help collect samples or record data',
    'Help classify data',
    'Help process samples',
    'Help validate data', 
    'Help analyze data'
  )) %>% 
  reshape2::melt() 

matrix7e$Var2 <- factor(matrix7e$Var2, levels = c("Not at all involved", "Very little involvement", "Moderate involvement", "High involvement", "Very high involvement"))
matrix7e <- matrix7e[matrix7e$value!=0,]
matrix7e <- matrix7e[!is.na(matrix7e$value),]

ggplot2::ggplot(matrix7e, aes(x = Var2, y = Var1)) + 
  ggplot2::geom_raster(aes(fill = as.numeric(value))) + 
  ggplot2::scale_fill_gradient(low = "grey90", high = "red4", na.value = "grey10", guide = "colourbar") +
  ggplot2::labs(x = "Degree of Involvement", y = "Type of Involvement") +
  ggplot2::scale_x_discrete(labels = function(x) stringr::str_wrap(x, width = 10)) +
  ggplot2::scale_y_discrete(labels = function(x) stringr::str_wrap(x, width = 30)) +
  ggplot2::labs(fill = "n of answers") +
  ggplot2::theme_classic() + ggplot2::theme(axis.text.x = element_text(size = 8, angle = 0, vjust = 0.3),
                          axis.text.y = element_text(size = 8),
                          plot.title = element_text(size = 11))
Show code
ggplot2::ggplot(matrix7e, aes(x = Var2, y = Var1)) + 
  # ggplot2::geom_raster() +
  # ggplot2::scale_fill_gradient(low = "grey90", high = "red") +
  geom_point(aes(size = as.numeric(value))) + #, colour = value)) +
  # scale_color_brewer(palette = "RdYlBu") +
  ggplot2::labs(x = "Degree of Involvement", y = "Type of Involvement", title = "Matrix") +
  ggplot2::scale_x_discrete(labels = function(x) stringr::str_wrap(x, width = 10)) +
  ggplot2::scale_y_discrete(labels = function(x) stringr::str_wrap(x, width = 30)) +
  ggplot2::scale_size_continuous(
    breaks = c(5, 10, 15, 20, 25),
    labels = c('<=5', '10', '15', '20', '>25')
  ) +
  ggplot2::theme_classic() + ggplot2::theme(axis.text.x = element_text(size = 9, angle = 0, vjust = 0.3),
                          axis.text.y = element_text(size = 9),
                          plot.title = element_text(size = 11))
Show code
ggplot2::ggplot(matrix7e, aes(x = Var2, y = Var1, size = as.numeric(value), label = value)) +
  ggplot2::labs(x = "Degree of Involvement", y = "Type of Involvement", title = "Matrix") +
  ggplot2::scale_x_discrete(labels = function(x) stringr::str_wrap(x, width = 10)) +
  ggplot2::scale_y_discrete(labels = function(x) stringr::str_wrap(x, width = 30)) +
  geom_text(size = as.numeric(matrix7e$value), aes(colour = as.numeric(matrix7e$value))) +
  ggplot2::labs(colour = "n of answers") +
  ggplot2::scale_colour_distiller(palette = "RdYlBu") +
  ggplot2::theme_classic()

7h - Training methodologies of the volunteers

Show code
dataset %>% 
  dplyr::filter(as.numeric(ProgressCS) >= 75) %>% 
  dplyr::filter(Q10 > 0) %>% 
  dplyr::select(Q10, Q22) %>% 
  dplyr::group_by(Q22) %>%
  dplyr::summarise(numProj = n()) %>% 
  dplyr::filter(!is.na(Q22)) %>% 
  dplyr::filter(numProj > 4) %>% 
  ggplot2::ggplot(ggplot2::aes(x = Q22, y = numProj)) +
  ggplot2::geom_bar(stat = "identity", fill = "yellow4") +
  ggplot2::xlab("") + ggplot2::ylab("Number of projects") +
  ggplot2::geom_text(ggplot2::aes(label = numProj), vjust = 1.6, color = "white", size = 3.5) +
  ggplot2::scale_x_discrete(labels = function(x) stringr::str_wrap(x, width = 10)) +
  ggplot2::theme_classic() + ggplot2::theme(axis.text.x = element_text(size = 4, angle = 0, vjust = 0.2),
                                   axis.text.y = element_text(size = 8))

7i - Data type

Show code
dataset %>% 
  dplyr::filter(as.numeric(ProgressCS) >= 75) %>%  
  dplyr::filter(Q10 > 0) %>% 
  dplyr::select(Q10, Q23, Q23_6_TEXT) %>% 
  dplyr::group_by(Q23) %>%
  dplyr::summarise(numProj = n()) %>% 
  dplyr::filter(!is.na(Q23)) %>% 
  dplyr::filter(numProj > 3) %>%
  ggplot2::ggplot(ggplot2::aes(x = Q23, y = numProj)) +
  ggplot2::geom_bar(stat = "identity", fill = "blueviolet") +
  ggplot2::xlab("") + ggplot2::ylab("Number of projects") +
  ggplot2::geom_text(ggplot2::aes(label = numProj), vjust = 1.6, color = "white", size = 3.5) +
  ggplot2::scale_x_discrete(labels = function(x) stringr::str_wrap(x, width = 1)) +
  ggplot2::theme_classic() + ggplot2::theme(axis.text.x = element_text(size = 5, angle = 0, vjust = 0.3),
                                   axis.text.y = element_text(size = 8))

7i - Quality check

Show code
dataset %>% 
  dplyr::filter(as.numeric(ProgressCS) >= 75) %>%  
  dplyr::filter(Q10 > 0) %>% 
  dplyr::select(Q10, Q25:Q25_9_TEXT) %>% 
  dplyr::group_by(Q25) %>%
  dplyr::summarise(numProj = n()) %>% 
  dplyr::filter(!is.na(Q25)) %>% 
  dplyr::filter(numProj > 3) %>%
  ggplot2::ggplot(ggplot2::aes(x = Q25, y = numProj)) +
  ggplot2::geom_bar(stat = "identity", fill = "blueviolet") +
  ggplot2::xlab("") + ggplot2::ylab("Number of projects") +
  ggplot2::geom_text(ggplot2::aes(label = numProj), vjust = 1.6, color = "white", size = 3.5) +
  ggplot2::scale_x_discrete(labels = function(x) stringr::str_wrap(x, width = 10)) +
  ggplot2::theme_classic() 

7j - Ways to share data

Show code
dataset %>% 
  dplyr::filter(as.numeric(ProgressCS) >= 75) %>%  
  dplyr::filter(Q10 > 0) %>% 
  dplyr::select(Q10, Q24, Q26:Q26_6_TEXT) %>%
  dplyr::group_by(Q24) %>%
  dplyr::summarise(numProj = n()) %>% 
  dplyr::filter(!is.na(Q24)) %>% 
  ggplot2::ggplot(ggplot2::aes(x = Q24, y = numProj)) +
  ggplot2::geom_bar(stat = "identity", fill = "chocolate4") +
  ggplot2::xlab("") + ggplot2::ylab("Number of projects") +
  ggplot2::geom_text(ggplot2::aes(label = numProj), vjust = 1.6, color = "white", size = 3.5) +
  ggplot2::scale_x_discrete(labels = function(x) stringr::str_wrap(x, width = 10)) +
  ggplot2::theme_classic() 

7j - Ways to share findings

Show code
dataset %>% 
  dplyr::filter(as.numeric(ProgressCS) >= 75) %>% 
  dplyr::filter(Q10 > 0) %>% 
  dplyr::select(Q10, Q24, Q26:Q26_6_TEXT) %>% 
  dplyr::group_by(Q26) %>%
  dplyr::summarise(numProj = n()) %>% 
  dplyr::filter(!is.na(Q26)) %>% 
  dplyr::filter(numProj > 3) %>%
  ggplot2::ggplot(ggplot2::aes(x = Q26, y = numProj)) +
  ggplot2::geom_bar(stat = "identity", fill = "chocolate4") +
  ggplot2::xlab("") + ggplot2::ylab("Number of projects") +
  ggplot2::geom_text(ggplot2::aes(label = numProj), vjust = 1.6, color = "white", size = 3.5) +
  ggplot2::scale_x_discrete(labels = function(x) stringr::str_wrap(x, width = 10)) +
  ggplot2::theme_classic() + ggplot2::theme(axis.text.x = element_text(size = 8, angle = 0, vjust = 0.3),
                                   axis.text.y = element_text(size = 8))

7k - Ways to acknowledge participants

Show code
dataset %>% 
  dplyr::filter(as.numeric(ProgressCS) >= 75) %>% 
  dplyr::filter(Q10 > 0) %>% 
  dplyr::select(Q10, Q27) %>%
  dplyr::group_by(Q27) %>%
  dplyr::summarise(numProj = n()) %>% 
  dplyr::filter(!is.na(Q27)) %>%
  dplyr::filter(numProj > 3) %>%
  ggplot2::ggplot(ggplot2::aes(x = Q27, y = numProj)) +
  ggplot2::geom_bar(stat = "identity", fill = "chocolate4") +
  ggplot2::xlab("") + ggplot2::ylab("Number of projects") +
  ggplot2::geom_text(ggplot2::aes(label = numProj), vjust = 1.6, color = "white", size = 3.5) +
  ggplot2::scale_x_discrete(labels = function(x) stringr::str_wrap(x, width = 10)) +
  ggplot2::theme_classic()

7l - Geographical distribution of the ILTER initiatives

Show code
# listOfAllSites <- ReLTER::get_ilter_generalinfo()
# saveRDS(listOfAllSites, file = "ilter_sitesData.rds")
listOfAllSites <- readRDS(file = "ilter_sitesData.rds")
# remove the sites without geometry
listOfAllSites <- listOfAllSites[c(1:1226, 1228:1237, 1239:1240, 1242:1243, 1248:1249), ]

siteWithDeimsId <- dataset %>% 
  dplyr::select(Q30) %>% 
  .[-160,] %>% 
  dplyr::filter(Q30 != "NA") %>% 
  dplyr::add_row(Q30 = c(
    "https://deims.org/664177a4-a21a-4f59-9601-00909e275868",
    "https://deims.org/5a38fc08-5257-4b13-8465-1d50ea166b95",
    "https://deims.org/96ba6c55-a555-4e96-a3e6-14d6dfe8785b",
    "https://deims.org/923cb154-83c9-444d-817a-cde7879c09b5"
  )) %>% 
  unique() # 84 DEIMS.iD
sitesOnSurvey <- listOfAllSites[listOfAllSites$uri %in% siteWithDeimsId$Q30, ] # 84 sites compared with ILTER formal sites

# collect biogeographicalRegion and biome from DEIMS site
sitesOnSurveyEnvChar <- lapply(
  as.list(sitesOnSurvey$uri),
  FUN = function(x) {ReLTER::get_site_info(x, category = c("EnvCharacts"))}
) %>% 
  dplyr::bind_rows() %>% 
  dplyr::select(uri, envCharacteristics.biogeographicalRegion, envCharacteristics.biome)
  
sitesOnSurvey_2 <- merge(x = sitesOnSurvey, y = sitesOnSurveyEnvChar, by.x = "uri", by.y = "uri", all = TRUE)

biomeNum <- sitesOnSurvey_2$envCharacteristics.biome[-68] %>% unique() %>% length()
getPalette <- grDevices::colorRampPalette(RColorBrewer::brewer.pal(12, "Set3"))

# Biome map plot
library("rnaturalearth")
library("rnaturalearthdata")

world <- ne_countries(scale = "medium", returnclass = "sf")
ggplot2::ggplot(data = world) +
  ggplot2::geom_sf() +
  ggplot2::xlab("Longitude") + ggplot2::ylab("Latitude") +
  ggplot2::scale_y_continuous(limits = c(-90, 90), expand = c(0, 0)) +
  ggplot2::scale_x_continuous(expand = c(0, 0)) +
  ggplot2::geom_sf(
    data = sitesOnSurvey_2$geometry[-68], 
    mapping = ggplot2::aes(
      color = sitesOnSurvey_2$envCharacteristics.biome[-68]
    ),
    size = 1
  ) + # feature 68 is missing the biome information in DEIMS-SDR
  # the points map biome to colour, so a colour scale (not a fill scale) sets palette and legend title
  ggplot2::scale_color_manual(name = "Biome", values = getPalette(biomeNum)) +
  ggplot2::ggtitle("iLTER Sites on survey")

7l - Biogeographic distribution of the ILTER initiatives

Show code
# Biogeographical Region map plot
nc <- sf::st_read("../TeaBagCatalogue/Maps_export/Zonobiome_poly.shp", quiet = TRUE)
ggplot2::ggplot() +
  ggplot2::scale_y_continuous(limits = c(-90, 90), expand = c(0, 0)) +
  ggplot2::scale_x_continuous(expand = c(0, 0)) +
  ggplot2::geom_sf(data = nc, ggplot2::aes(fill = Legend), lwd = 0) +
  ggplot2::geom_sf(data = sitesOnSurvey_2$geometry[-68], color = "black", size = 1) + # feature 68 missing the information in DEIMS_SDR about the Biome
  ggplot2::ggtitle("iLTER Sites on survey") +
  ggplot2::scale_fill_discrete(name = "Biogeographical Region")

Acknowledgments

This article was created with the Distill R package (Dervieux et al. 2022); the summary tables were made with the gtsummary R package (Sjoberg et al. 2021).

References

Dervieux, Christophe, JJ Allaire, Rich Iannone, Alison Presmanes Hill, and Yihui Xie. 2022. Distill: 'R Markdown' Format for Scientific and Technical Writing.
Sjoberg, Daniel D., Karissa Whiting, Michael Curry, Jessica A. Lavery, and Joseph Larmarange. 2021. “Reproducible Summary Tables with the Gtsummary Package.” The R Journal 13: 570–80. https://doi.org/10.32614/RJ-2021-053.

Citation

For attribution, please cite this work as

Oggioni & Bergami (2022, June 18). Statistical analysis for exploring environmental Citizen Science practices at ILTER. Retrieved from https://github.com/oggioniale/xxx

BibTeX citation

@misc{oggioniCSI2022,
  author = {Oggioni, Alessandro and Bergami, Caterina},
  title = {Statistical analysis for exploring environmental Citizen Science practices at ILTER},
  url = {https://github.com/oggioniale/xxx},
  year = {2022}
}